FRASH: A framework to test algorithms of similarity hashing


Similar resources

FRASH: A framework to test algorithms of similarity hashing

Automated input identification is a very challenging, but also important task. Within computer forensics this reduces the amount of data an investigator has to look at by hand. Besides identifying exact duplicates, which is mostly solved using cryptographic hash functions, it is necessary to cope with similar inputs (e.g., different versions of a file), embedded objects (e.g., a JPG within a Wo...
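The gap the abstract describes, where cryptographic hashes identify exact duplicates but say nothing about similar inputs, can be seen in a minimal sketch. The sketch uses Python's standard `hashlib`. The two byte strings and the helper names `sha256_hex` and `differing_hex_digits` are illustrative assumptions, not part of FRASH.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_hex_digits(a: str, b: str) -> int:
    """Count positions at which two equal-length hex digests differ."""
    return sum(x != y for x, y in zip(a, b))


# Two hypothetical "versions of a file" that differ in a single byte.
original = b"The quick brown fox jumps over the lazy dog"
modified = b"The quick brown fox jumps over the lazy cog"

digest_original = sha256_hex(original)
digest_modified = sha256_hex(modified)

# Exact duplicates always produce the same digest...
print(digest_original == sha256_hex(bytes(original)))
# ...but a one-byte change yields an unrelated digest, so the digests
# carry no information about how similar the two inputs are.
print(digest_original == digest_modified)
print(differing_hex_digits(digest_original, digest_modified), "of 64 hex digits differ")
```

Because the digests of near-identical inputs are unrelated, comparing them can only answer "identical or not", which is why similarity hashing is treated as a separate problem.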


Sparse similarity-preserving hashing

In recent years, a lot of attention has been devoted to efficient nearest neighbor search by means of similarity-preserving hashing. One of the plights of existing hashing techniques is the intrinsic trade-off between performance and computational complexity: while longer hash codes allow for lower false positive rates, it is very difficult to increase the embedding dimensionality without incur...


Hashing for Similarity Search: A Survey

Similarity search (nearest neighbor search) is a problem of pursuing the data items whose distances to a query item are the smallest from a large database. Various methods have been developed to address this problem, and recently a lot of efforts have been devoted to approximate search. In this paper, we present a survey on one of the main solutions, hashing, which has been widely studied since...


Selection of Hashing Algorithms

Introduction: The National Software Reference Library (NSRL) Reference Data Set (RDS) is built on file signature generation technology that is used primarily in cryptography. The selection of the specific file signature generation routines is based on customer requirements and the necessity to provide a level of confidence in the reference data that will allow it to be used in the U.S. Courts. T...


WaldHash: sequential similarity-preserving hashing

Similarity-sensitive hashing seeks compact representation of vector data as binary codes, so that the Hamming distance between code words approximates the original similarity. In this paper, we show that using codes of fixed length is inherently inefficient as the similarity can often be approximated well using just a few bits. We formulate a sequential embedding problem and approach similarity...
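The idea of binary codes whose Hamming distance tracks the original similarity can be illustrated with a minimal sketch. Note that this is not WaldHash: it swaps in random hyperplane hashing (sign of random projections) with a fixed code length, chosen only because it is short and self-contained. All names (`make_hyperplanes`, `binary_code`, `hamming`), the dimensions, and the sample vectors are assumptions made for illustration.

```python
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def binary_code(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its positive side."""
    return [
        1 if sum(h * x for h, x in zip(plane, vector)) > 0 else 0
        for plane in hyperplanes
    ]


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


N_BITS, DIM = 128, 8
planes = make_hyperplanes(N_BITS, DIM)

base = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
near = [1.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # small angle to base
opposite = [-x for x in base]                     # angle of 180 degrees

code_base = binary_code(base, planes)
code_near = binary_code(near, planes)
code_opposite = binary_code(opposite, planes)

# Vectors separated by a small angle disagree on few bits; opposite
# vectors fall on opposite sides of every hyperplane.
print("near:    ", hamming(code_base, code_near), "of", N_BITS, "bits differ")
print("opposite:", hamming(code_base, code_opposite), "of", N_BITS, "bits differ")
```

In this scheme the expected fraction of differing bits is proportional to the angle between the two vectors, so a longer code gives a finer estimate at a higher cost. That fixed-length trade-off is the one the abstract above questions.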



Journal

Journal title: Digital Investigation

Year: 2013

ISSN: 1742-2876

DOI: 10.1016/j.diin.2013.06.006